Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAS data column custody tracker #5469

Closed
wants to merge 4 commits into from

Conversation

eserilev
Copy link
Collaborator

Issue Addressed

Add a data column custody tracker field to BeaconChain. At the moment this field is only updated once, during network service initialization. In a future iteration, the data column subnet service will handle custody requirement rotations at epoch boundaries, peer discovery, and data column subnet subscriptions/un-subscriptions. This PR just aims to quickly make data column info available on the beacon chain to unblock DAS related work.

@eserilev eserilev requested a review from jimmygchen March 23, 2024 13:08
@eserilev eserilev added das Data Availability Sampling ready-for-review The code is ready for review labels Mar 23, 2024
@eserilev
Copy link
Collaborator Author

Since epoch rotation wont necessarily happen at epoch boundaries tracking column custody requirements based on the current epoch no longer makes sense. I've updated data_column_custody_tracker to just store the current custody requirement list.

@dapplion
Copy link
Collaborator

dapplion commented Apr 3, 2024

If I understand correctly, the intention is for the required custody column subnet subscriptions to match our long-term attestation subnet subscriptions. Such that their sets either match or overlap and we don't need complicated logic to rank peers based on orthogonal requirements. Is that correct @AgeManning ?

@AgeManning
Copy link
Member

Yep, i'm not sure about the timing yet. It may not get pinned to the attestations depending on what other client teams think.

However, it should be easy and we won't need fancy logic. Essentially we can add a method to network_globals that returns which data columns any given peer should be on. That function is somewhat trivial and we can calculate it on the fly (I dont think we'll need a data structure to pre-calculate this). Essentially given a peer-id we can easily calculate the data-columns. This can only rotate on an epoch boundary, there are no rotations mid-epochs.

The harder structure is given a data-column to obtain a list of peer-id prefixes we might want to search for. This does require some pre-calculation and some data to be stored. But its mostly done in this PR: #5090

If this PR helps unblock work i'm all for it. It will likely be simplified in the future. I'll push a bit harder for the spec change so we can solidify it in lighthouse.

@jimmygchen
Copy link
Member

I had a chat with @dapplion today and he raised a good point about why setting the custody columns on beacon_chain is insufficient. When we perform sampling during sync, we'd also need to know about the custody_columns for that past epoch, so there needs to be a way to compute this on the fly in BeaconChain or sync rather than relying on network to set the latest value on the beacon_chain.

We've talked about a few alternatives:

  • Having BeaconChain load the private key during construction. This would require BeaconChain to have knowledge of deriving private key -> node_id -> custody column though, and we'd also need to maintain the custody_columns separately in both network and beacon_chain
  • Having SyncNetworkContext always providing beacon_chain with the list of custody columns

We could potentially pass a closure Fn(epoch) -> Vec[DataColumnSubnetId] to the beacon chain so that it can compute the subnet ids for any epoch without beacon chain having to know about the node_id and derivation logic.

@dapplion
Copy link
Collaborator

dapplion commented Apr 9, 2024

image

@jimmygchen
Copy link
Member

Yeah so we've decided not to pass the 32 bytes to the beacon chain after a lengthy discussion with the network devs 🤣

I'll draft up a PR shortly - the solution proposed is below:

Custody Columns from gossip

  • After receiving a gossip verfieid data column, we wrap it in a CustodyDataColumn object and send it to DAChecker. (nodes only subscribe to subnets they custody)
  • DAChecker is aware of how many columns it's custodying, so once it collects enough distinct CustodyDataColumns then custody requirement is fullfiled.

Custody Columns from RPC

  • If we receive an attestation with unknown block, this triggers a block and data column lookup for the root from sync (sync can compute the custody columns)
  • When response for this request type (RequestId::SingleDataColumn) is received, sync wraps the data in a CustodyDataColumn` and send it to DAChecker
  • DAChecker is aware of how many columns it's custodying, so once it collects enough distinct CustodyDataColumns then custody requirement is fullfiled.
  • For sampling responses, I guess we would have a different RequestId to differentiate, e.g. RequestId::Sampling?

Custody Columns from HTTP API

  • http_api has access to network_globals and can calculate which columns need to custody, so we can wrap them the same way and send them to the beacon chain.

@jimmygchen
Copy link
Member

@eserilev Thanks a lot for the PR - it's been helpful triggering these discussions. I will close this one now and draft up the new proposal mentioned above.

@jimmygchen jimmygchen closed this Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
das Data Availability Sampling ready-for-review The code is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants